Improved Inter-Processor Communication System for Communication between Processors
Field of the invention 

The present invention concerns generally the communication between two or more processors. In
particular, the present invention concerns the inter-processor communication between processors
that are arranged on the same semiconductor die.

Background of the invention 

As the demand for more powerful computing devices increases, more and more systems are offered
that comprise more than just one processor.

For the purposes of the present invention, a distinction is to be made between computer systems
that comprise two or more discrete processors and systems where two or more processors are
integrated on the same chip. A computer with a main central processing unit (CPU) on a mother
board and an algorithmic processor on a graphics card is an example of a computer system with two
discrete processors. Another example of a computer system with several discrete processors is a
parallel computer where an array of processors is arranged such that an improved performance is
achieved. For the sake of simplicity, systems on a board with two or more discrete processors are
also considered to belong to the same category.

There are systems where two or more processors are integrated on the same chip or semiconductor
die. A typical example is a SmartCard (also referred to as integrated circuit card) that has a main
processor and a crypto-processor on the same semiconductor die.

As small handheld devices are becoming more and more popular, the demand for powerful and
flexible chips is increasing. A typical example is the cellular phone, which in the beginning of its
dissemination was just a telephone for voice transmission (analogue communication). Over the years
additional features have been added and most of today's cellular phones are designed for voice and
data services. Additional differentiators are wireless application protocol (WAP) support, short
message system (SMS), and multimedia message service (MMS) functionality, just to name some of
the more recent developments. All these features require more powerful processors and quite often
even dual-processor or multi-processor chips.

In the future, systems handling digital video streams, for example, will become available. These
systems also require powerful and flexible chip sets.

Other examples are integrated circuit cards, such as multi-purpose JavaCards, small handheld
devices, such as palm top computers or personal digital assistants (PDAs), video and audio devices,
devices for use in automotive applications, and so forth.

It is essential for such dual-processor or multi-processor chips that there exists a communication
channel for efficient inter-processor communication. The expression "inter-processor
communication" is herein used as a synonym for any communication between a first processor
and/or system resources associated with this first processor and a second processor and/or system
resources associated with this second processor. A shared memory (e.g., a random access memory) is
an example of a system resource that usually needs to be accessible by all processors of a chip.

System resources have to be shared in an efficient manner in dual-processor or multi-processor chips
where the processors operate in parallel on the same aspect of a task or on different aspects of the
same task. The sharing of resources may also be necessary in applications where processors are called
upon to process related data.

An example of a multi-processor system is given in the European Patent application EP 0 580 961-
A1, filed on 16 April 1993. This Patent application concerns a system with multiple discrete
processors and a global bus that is shared by all these processors. Enhanced processor interfaces are
provided for linking the processors to the common bus. Such multi-processor systems with a global
bus cannot be realized using RISC processors, due to the high bus load which would have an impact
on the system's performance. The multi-processor system presented in EP 0 580 961-A1 is powerful
but complicated and expensive to implement. The shown structure cannot be used in
multi-processor systems on a common die.

Another system is proposed in US Patent US 4,866,597, filed on 26 April 1985. This US Patent
concerns a multi-processor system where each processor has its own processor bus. Data are
exchanged between these processors via first-in-first-out data buffers (FIFO) which directly
interconnect the respective processor buses. It is a disadvantage of this approach that the size of the
buffers increases dramatically with the amount of data to be transferred.



US Patent 5,093,780 concerns an inter-processor transmission system that has a data link which
automatically reads and writes transfer data. A direct memory access (DMA) unit and a transmitter
are assigned to a first processor and a receiver together with a DMA unit are assigned to a second
processor. The processor has to set up the transfer by programming the corresponding DMA. That
is, the processor has to know upfront whether data are to be transferred. This is a disadvantage of
the described inter-processor transmission system, since the respective processor needs to be
involved. Another disadvantage of the said system is the fact that the whole transmission is
mono-directional, i.e., the implementation is asymmetric. It is just possible to transfer data from the
memory 16 on the left hand side of Figure 4 of US 5,093,780 to the memory 26 on the right hand
side.

A DMA controller for a multi-microcomputer system is disclosed in US patent 5,222,227. The
DMA controller has the function of controlling data transfer operations that are executed by the
microcomputer systems. Separate address and data pipelines are provided. Tri-state technology is
used for the buses. The buses CDB and SDB are at least temporarily electrically interconnected. As a
consequence, both buses have to be operated at the same clock speed and both buses have to have
the same bus width. According to the US patent 5,222,227, only homogeneous buses can be
interconnected. There is no external DMA channel used in the system presented.

A multi-processor system with a shared memory is described and claimed in US Patent US
5,283,903, filed on 17 September 1991. The system in accordance with this US Patent comprises a
plurality of processors, a shared memory (main memory), and a priority selector unit. The priority
selector unit arbitrates between those processors that request access to the shared memory. This is
necessary, since the shared memory is a single-port memory (e.g., a random access memory) that
cannot handle simultaneous and competing requests from several processors. It is a disadvantage of
this approach that the shared memory is expensive when it serves only as intermediate storage. The
shared memory can become large when the data transfer volume is high.

Another multi-processor system is described in US Patent US 5,289,588, filed on 24 April 1990.
The processors are coupled by a common bus. They can access a shared memory via this common
bus. A cache is associated with each processor and an arbitration scheme is employed to control the
access to the shared memory. It is a disadvantage of this approach that the cache memory is
expensive, as only big caches give a real performance boost. In addition, bus conflicts lead to a
reduced performance of each processor.



A microprocessor architecture is described in the PCT Patent application PCT/JP92/00869, filed on
7 July 1992, and published under PCT Publication number WO 93/01553. The architecture
supports multiple heterogeneous processors which are coupled by data, address, and control signal
buses. Access to a memory is controlled by arbitration circuits.

Some of the known multiprocessor systems use architectures where the inter-processor
communication occupies part of the processor's processing cycles. It is desirable to avoid this
overhead and to free up the processor's processing power in order to be able to better exploit the
processor's capabilities and performance.

Other known schemes cannot be used for integrated multi-processor systems where two or more
processors are located within the same chip.

It is yet another disadvantage of some known systems that they are asymmetric in their
implementation, which means that different implementations are required for each processor.
Furthermore, the effort for formal verification is greater for asymmetric than for symmetric
implementations.



Summary of the Invention

It is an object of the present invention to provide a scheme for efficient data transfer between two or 
more processors and/or their associated components. 

It is an object of the present invention to provide an inter-processor data transfer scheme that is
suited for the integration into a semiconductor die.

These and other objectives are achieved by the present invention which provides a system that
comprises at least two integrated processors. According to the present invention, these two
processors are operably connected via a communication channel for exchanging information. One
processor (P1) has a processor bus, a shareable unit, and a DMA unit with two external DMA
channels. The DMA unit and the shareable unit are connected to the processor bus. The other
processor also has a shareable unit and a DMA unit with two external DMA channels.
Programmable units are employed enabling the processor to set up the desired communication
links. Due to this arrangement, two bi-directional communication channels are establishable between
the two bus regimes.

The two or more processors can be arranged on a common semiconductor die. This allows
computing devices such as PDAs, handheld computers, palm top computers, cellular phones, and
cordless phones, for example, to be realised.

The communication channel can be used advantageously for communication between two or more
processors and/or their associated components. The inventive arrangement suits general multi-core
communication needs. The arrangement is highly symmetrical and it allows the number of
otherwise needed bus masters for each processor to be minimised. The present scheme is
expandable and very flexible.



These and other aspects of the invention will be apparent from and elucidated with reference to the
embodiment(s) described hereinafter.

Brief description of the drawings

For a more complete description of the present invention and for further objects and advantages
thereof, reference is made to the following description, taken in conjunction with the accompanying
drawings, in which:

FIG. 1 is a schematic block diagram of a dual-processor computer system, according to a
first embodiment of the present invention.

FIG. 2 is a schematic illustration of an inter-processor communication system according to the
present invention.

FIG. 3 is a detailed block diagram of the inter-processor communication
system of Figure 2.

FIG. 4 is a schematic block diagram of a dual-processor computer system, according to
another embodiment of the present invention.

FIG. 5 is a schematic block diagram of a dual-processor computer system, according to
another embodiment of the present invention.

FIG. 6 is a detailed block diagram of the DTU unit of Figure 5.



DESCRIPTION OF PREFERRED EMBODIMENTS

The present invention is described in connection with several embodiments. 

As shown in Figure 1, a dual-processor system to which the present invention is applied comprises a
first processor P1 that is connected via a first processor bus 10 to a first shareable unit 13. A
processor bus (also called microprocessor bus) is the main path connecting to the computer system's
processor. An example of a shareable unit 13 or 23 is a shared memory (e.g., a random access
memory; RAM). The first processor bus 10 is a 64 bit, 20MHz bus. The system comprises a second
processor P2 that also has a processor bus 20. This second processor bus 20 is a 64 bit, 66MHz bus.
An interconnection between the two processor environments 18 and 28 (schematically illustrated by
ovals in Figure 1) is established via two bi-directional communication channels 11 and 21. The first
bi-directional channel 11 is programmable by the processor P1, as indicated by the arrow 12, and
the second channel 21 is programmable by the processor P2, as indicated by the arrow 22. The two
bi-directional channels 11 and 21 are hereinafter referred to as intercore communication system 9.

More details of the first embodiment are depicted in Figure 2. The intercore communication system
29 comprises a first DMA unit 45 (DMA1) with a first and a second external DMA channel 46, 47.
The first DMA unit 45 is connectable to the first processor bus 10 via an internal DMA channel 49.
It furthermore comprises a first double tandem unit (DTU) 34 (DTU1) which is connectable via
the first external DMA channel 47 to the first DMA unit 45. The DTU unit 34 is programmable
by the first processor P1, as indicated by the arrow 32, and the first DMA unit 45 is programmable
by the first processor P1, as indicated by the arrow 132. In addition, the intercore communication
system 29 comprises a second DMA unit 35 (DMA2) and a second DTU unit 44 (DTU2). The
DMA unit 35 has a first and a second external DMA channel 36, 37, and an internal DMA channel
39. The second DMA unit 35 is connected via the internal DMA channel 39 to the second
processor bus 20. The second DMA unit 35 and the second DTU unit 44 are connectable via the
first external DMA channel 37. The DTU unit 44 is programmable by the second processor P2, as
indicated by the arrow 42, and the second DMA unit 35 is programmable by the second processor
P2, as indicated by the arrow 142. A first bi-directional communication channel is implemented by
the first DTU 34 and a second bi-directional channel is implemented by the second DTU 44. Each
DTU 34, 44 is directly connectable to the processor for programming/configuration purposes, as
illustrated in Figure 2. An interconnection between the two processor environments 38 and 48
(schematically illustrated by ovals in Figure 2) is established by a bi-directional data transfer.

It is a novel feature of the embodiment given in Figures 1 and 2 that the programming of one DTU
unit, e.g. DTU1, allows (bi-directional) data transfer from the processor environment 48 to the
processor environment 38 and vice versa without the programming of any other resource. Data can
be moved to the DMA1 and fetched from the shareable unit 23 that is attached to the second
processor bus 20. The DTU2 is able to move data to the DMA2 and to fetch data from the
shareable unit 13 that is attached to the first processor bus 10.
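
The behaviour described in the preceding paragraph can be illustrated with a minimal software
sketch. It is not part of the disclosed hardware: the class name Dtu, the method transfer, and its
parameters are hypothetical, and the shareable units 13 and 23 are modelled as plain byte arrays.
The sketch only shows that programming a single DTU (here DTU1, programmed by P1) suffices to
move data in both directions.

```python
class Dtu:
    """Toy model of one DTU; the names and the interface are illustrative assumptions."""

    def __init__(self, local_unit: bytearray, remote_unit: bytearray):
        self.local_unit = local_unit    # shareable unit on the programming processor's bus
        self.remote_unit = remote_unit  # shareable unit on the other processor's bus

    def transfer(self, to_remote: bool, src: int, dst: int, length: int) -> None:
        """Copy `length` bytes between the two units in the programmed direction."""
        if to_remote:
            source, target = self.local_unit, self.remote_unit
        else:
            source, target = self.remote_unit, self.local_unit
        if min(src, dst, length) < 0:
            raise ValueError("offsets and length must be non-negative")
        if src + length > len(source) or dst + length > len(target):
            raise ValueError("transfer exceeds the size of a shareable unit")
        target[dst:dst + length] = source[src:src + length]


unit_13 = bytearray(b"hello...")  # attached to the first processor bus 10
unit_23 = bytearray(8)            # attached to the second processor bus 20
dtu1 = Dtu(local_unit=unit_13, remote_unit=unit_23)  # only DTU1 is programmed

dtu1.transfer(to_remote=True, src=0, dst=0, length=5)   # bus 10 side -> bus 20 side
unit_23[5:8] = b"abc"                                    # data produced on the bus 20 side
dtu1.transfer(to_remote=False, src=5, dst=5, length=3)  # bus 20 side -> bus 10 side
```

In this toy model both directions are driven through the one object dtu1; no second DTU and no
action by the second processor is needed for either copy.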

The dual-processor arrangement illustrated in Figures 1 or 2 allows the second processor P2 to
access the shareable unit 13. The shareable unit 23 is accessible by the processor P1.

In more general terms, one processor (processor P2 in the present embodiment) of a multi-processor
system in accordance with the present invention is able to access resources (the shareable unit 13 in
the present embodiment) that are associated with another processor (processor P1 in the present
embodiment). A resource of another processor on a remote bus may be accessed for data upload
and download from cheap remote memory, for instance. A processor may for example access the
memory of a co-processor to fetch data that were computed by the co-processor. These are just two
typical examples of situations where a first processor accesses resources on a remote bus.

Various types of processors can be interconnected using the present scheme. It allows chips with
multiple homogeneous processors or even with multiple heterogeneous processors to be realised.
The word processor is herein used as a synonym for any processing unit that can be integrated into
a semiconductor chip and that actually executes instructions and works with data.

Complex instruction set computing (CISC) is one of the two main types of processor designs in use
today. It is slowly losing popularity to reduced instruction set computing (RISC) designs. The most
popular current CISC processor is the x86, but there are also 68xx, 65xx, and Z80s in use.

Currently, the fastest processors are RISC-based. There are several popular RISC processors,
including Alphas (developed by Digital and currently produced by Digital/Compaq and Samsung),
ARMs (developed by Advanced RISC Machines, currently owned by Intel, and currently produced
by both the above and Digital/Compaq), PA-RISCs (developed by Hewlett-Packard), PowerPCs
(developed in a collaborative effort between IBM, Apple, and Motorola), and SPARCs (developed
by Sun; the SPARC design is currently produced by many different companies).



ARMs are different from most other processors in that they were not designed to maximise
performance but rather to maximise performance per power consumed. Thus ARMs find most of
their use on hand-held machines and PDAs.

In the above sections some examples of the processors were given that can be interconnected in
accordance with the present invention. Also suited are digital signal processors (DSPs), the
processor cores of any of the known processors, and customer specific processor designs. In other
words, the present concept is applicable to most microprocessor architectures. One can even
interconnect a processor with a slow processor bus and a processor with a fast processor bus.

For the purpose of the present application, the following is also considered to be a processor: central
processing unit (CPU), microprocessor, digital signal processor (DSP), system controller (SC),
co-processor, auxiliary processor, control unit and so forth.

A direct memory access (DMA) unit is a unit that is designed for passing data from a memory to
another device without passing it through the processor. A DMA typically has one or more
dedicated internal DMA channels and one or more dedicated external DMA channels for external
peripherals. Such an external DMA channel, contrary to an internal DMA channel that is
controlled by the processor to which it is associated, is set up by external agents in order for the
remote processor to get access to another processor's shareable unit. For instance, a DMA allows
devices on a processor bus to access memory without requiring intervention by the processor.

Examples of shareable units are: volatile memory, non-volatile memory, peripherals, interfaces, input
devices, output devices, and so forth.

The intercore communication system, according to the present invention, decouples the data flow
between the clock domain of a first processor P1 and the clock domain of a second processor P2.
This means that within the limits of the inventive data transfer system, the activity on one processor
does not require simultaneous and equivalent activity on the other processor.

Details of the intercore communication system 29, according to the present invention, are described
in connection with Figure 3. The intercore communication system 29 comprises a first DTU 34
(DTU1), a second DTU 44 (DTU2), a first DMA 45 (DMA1), and a second DMA 35 (DMA2).
In the present embodiment, the DMA unit 35 comprises two external DMA channel units 56, 57.
The internal channel 39 of these two external DMA channel units 56, 57 is connected to the
processor bus 20. The first external DMA channel unit 56 is connected via a link 36 to the second
DTU 44. The second external DMA channel unit 57 is connected via a link 37 to the first DTU 34.
The first DMA unit 45 comprises two external DMA channel units 54, 55. The first external
DMA channel unit 55 is connected via a link 47 to the first DTU 34. The second external DMA
channel unit 54 is connected via a link 46 to the second DTU 44. The internal channel 49 of these
two external DMA channel units 54, 55 is connected to the processor bus 10.

The DTU 34 comprises a first processor interface 60 allowing a programming link 52 to be
established via the processor bus 10 to the processor P1 (not shown in Figure 3). The DTU 34
further comprises a direct access unit core (DAU core) 62, and two external DMA channel interfaces
61 and 63. The external DMA channel interface 61 serves as interface to the external DMA channel
unit 55 and the external DMA channel interface 63 serves as interface to the external DMA channel
unit 57.

The DTU 44 comprises a first processor interface 50 allowing a programming link 51 to be
established via the processor bus 20 to the processor P2 (not shown in Figure 3). The DTU 44
further comprises a direct access unit core (DAU core) 52, and two external DMA channel interfaces
51 and 53. The external DMA channel interface 51 serves as interface to the external DMA channel
unit 56 and the external DMA channel interface 53 serves as interface to the external DMA channel
unit 54.



The clock signal of the first processor P1 (clock1) is fed via a clock line 58 to the following units:
external DMA channel unit 54, external DMA channel unit 55, external DMA channel interface
53, external DMA channel interface 61, and DAU core 62. The clock signal of the second processor
P2 (clock2) is fed via a clock line 59 to the following units: external DMA channel unit 56, external
DMA channel unit 57, external DMA channel interface 51, external DMA channel interface 63,
and DAU core 52.
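
The clock assignment and the links of Figure 3 can be cross-checked with a small table-driven
sketch. The reference numerals are taken from the description; the dictionary layout and the
helper names are assumptions made only for this illustration, and the pairing of interface 51
with channel unit 56 is inferred from the statement that unit 56 is linked to the second DTU 44.
The sketch checks that every external DMA link joins two blocks in the same clock domain, so that
the only domain crossings lie inside the DTUs.

```python
# Clock domain of each block, as listed for clock line 58 (clock1) and 59 (clock2).
CLOCK_DOMAIN = {
    "dma_channel_unit_54": "clock1", "dma_channel_unit_55": "clock1",
    "dma_channel_unit_56": "clock2", "dma_channel_unit_57": "clock2",
    "channel_interface_61": "clock1", "channel_interface_53": "clock1",
    "channel_interface_51": "clock2", "channel_interface_63": "clock2",
    "dau_core_62": "clock1", "dau_core_52": "clock2",
}

# External DMA links: link numeral -> (DTU channel interface, DMA channel unit).
LINKS = {
    47: ("channel_interface_61", "dma_channel_unit_55"),  # DTU 34 <-> DMA 45
    37: ("channel_interface_63", "dma_channel_unit_57"),  # DTU 34 <-> DMA 35
    36: ("channel_interface_51", "dma_channel_unit_56"),  # DTU 44 <-> DMA 35
    46: ("channel_interface_53", "dma_channel_unit_54"),  # DTU 44 <-> DMA 45
}

# Inside each DTU, the DAU core controls both of its channel interfaces.
DTU_INTERNAL = {
    "dau_core_62": ("channel_interface_61", "channel_interface_63"),  # DTU 34
    "dau_core_52": ("channel_interface_51", "channel_interface_53"),  # DTU 44
}


def links_crossing_domains():
    """Return the numerals of external links whose two ends use different clocks."""
    return [n for n, (a, b) in LINKS.items() if CLOCK_DOMAIN[a] != CLOCK_DOMAIN[b]]


def internal_crossings():
    """Return (core, interface) pairs inside a DTU that sit in different clock domains."""
    return [
        (core, interface)
        for core, interfaces in DTU_INTERNAL.items()
        for interface in interfaces
        if CLOCK_DOMAIN[core] != CLOCK_DOMAIN[interface]
    ]
```

Under these tables no external link crosses a clock domain; the crossings are between DAU core 62
and interface 63 and between DAU core 52 and interface 53.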

The processor P1 configures the first DTU 34 by means of the first processor interface 60. The
DAU core 62 of the DTU 34 is the control logic for the two external channel interface units 61 and
63. The DAU core 62 furthermore performs the data transfers, ideally enhanced by a first-in first-out
(FIFO) buffer. In the same way, the processor P2 configures the second DTU 44 via the second
processor interface 50. In both cases the external channels of the first DMA unit 45 use the
resources of the internal DMA channel 49 on the processor bus 10, and the external channels of the
second DMA unit 35 use the resources of the internal DMA channel 39 on the processor bus 20.

As illustrated in Figure 3, the intercore communication system 29 provides for a clock decoupling.
All the blocks are either clocked by the clock1 of the processor P1 or by the clock2 of the processor
P2 such that the activity on one processor does not require simultaneous and equivalent activity on
the other processor.



In cases where there is no phase and/or frequency relationship between the signals clock1 and
clock2, the DAU cores 52, 62 can be implemented such that they are enabled to provide safe data
transfer by means of appropriate handshaking signals. These handshaking signals are active between
the DAU core 52 and the external DMA channel interface 53 as well as
between the DAU core 62 and the external DMA channel interface 63.
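
The effect of such a handshake can be sketched in software. The following is a simplified model
under stated assumptions, not the disclosed circuit: the function name, the integer clock
periods, and the FIFO depth are hypothetical, and real hardware would additionally need
synchroniser stages for signals that cross between the domains. The source side writes a word
only when the FIFO reports "not full", and the destination side reads only when it reports "not
empty", so the data arrive complete and in order whatever the ratio of the two clocks.

```python
from collections import deque


def transfer_across_domains(words, src_period, dst_period, fifo_depth=4, max_ticks=100_000):
    """Move `words` through a bounded FIFO between two unrelated clock domains.

    A clock edge of a domain occurs whenever the tick count is a multiple of its period.
    Returns the list of words as seen by the destination domain.
    """
    to_send = deque(words)
    fifo = deque()
    received = []
    for tick in range(max_ticks):
        # Source-domain edge: write only while the FIFO signals "not full".
        if tick % src_period == 0 and to_send and len(fifo) < fifo_depth:
            fifo.append(to_send.popleft())
        # Destination-domain edge: read only while the FIFO signals "not empty".
        if tick % dst_period == 0 and fifo:
            received.append(fifo.popleft())
        if not to_send and not fifo:
            break  # everything has been delivered
    return received
```

For example, transfer_across_domains(list(range(20)), 3, 7) models a fast writer and a slow
reader, and swapping the two periods models the opposite case; in both cases the words are
delivered in their original order.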

The external DMA channel interfaces and/or the DAU cores can be standardised. In other words,
each DTU or DMA, according to the present invention, may contain an identical functional core.
Only the processor interface has to be adapted depending on the actual processor and/or processor
bus employed. This leads to a reduced development time due to maximising of re-use and reduced
verification effort.

According to the present invention, a DMA unit is connected via its internal interface to a processor
bus and via its external interface to a DTU. The external interface may be 8 bits wide.

The processor interface has a programming input (e.g. input 52 in Figure 3), since this interface
serves for the programming of the DTU in which it is comprised. The processor interface does not
require any data link to the processor bus, since any data exchanged is handled by the DTU's
external DMA channel interfaces. The setup and configuration of the bi-directional channel is done
by a processor by programming, via the processor interface, the respective DTU's DAU core.

The DTU 34, for instance, makes use of the external DMA channel 47 in order to transfer
information (data and/or control information) to and from the shareable unit 13.

Another embodiment is illustrated in Figure 4. A system is illustrated that comprises a first processor
P1, a first processor bus 70, and a first shareable unit 76 being connected to the first processor bus
70. There is a second processor P2, a second processor bus 80 and a second shareable unit 86
attached thereto. A first bi-directional communication channel is establishable via a first DMA unit
83 with an external DMA channel 85. The first DMA unit 83 is connectable to the first processor
bus 70. A first DTU unit 82 is provided. The first DTU unit 82 is connectable via the external
DMA channel 85 to the first DMA unit 83. The DTU unit 82 is programmable by the first
processor P1. The programming takes place via the processor bus 70 and a programming link 84.
Furthermore, a master unit (master2) 81 is provided. This master unit 81 serves as an interface
between the first DTU 82 and the second processor bus 80. A second bi-directional communication
channel is establishable via a second DMA unit 73 with an external DMA channel 75. The second
DMA unit 73 is connectable to the second processor bus 80. A second DTU unit 72 is provided.
The second DTU unit 72 is connectable via the external DMA channel 75 to the second DMA
unit 73. The DTU unit 72 is programmable by the second processor P2. The programming takes
place via the processor bus 80 and a programming link 74. Furthermore, a master unit (master1) 71
is provided. This master unit 71 serves as an interface between the second DTU 72 and the first
processor bus 70. A master in the present context is a unit being able to initiate (and continue) data
transfers on the processor buses. The masters therefore need to have access to some kind of
arbitration (prioritization) on the buses (this prioritization is not part of the present patent
application). The masters have an addressing circuitry used to select the active device on the
processor bus.



Another embodiment is illustrated in Figure 5. An intercore communication system 99 is provided
that allows two bi-directional channels to be established between the two processor buses 90 and
100. The processor P1 may be a digital signal processor (DSP) core, and the processor P2 may be a
system controller (SC) core, for example. In the present embodiment, there is one common DTU
unit 92 which for example comprises the functional elements of the DTU1 and DTU2 of Figures 2,
3, or 4. One part of this common DTU 92 is programmable by the processor P1, as indicated by
the arrow 104, and the other part is programmable by the processor P2, as indicated by the arrow
94. Details of this DTU 92 are depicted in Figure 6.

The common DTU 92 comprises a first processor interface 120 allowing a programming link 104
to be established via the processor bus 90 to the processor P1 (not shown in Figure 6). The DTU 92
further comprises a direct access unit core (DAU core) 122, and two external DMA channel
interfaces 121 and 123. The external DMA channel interface 121 serves as interface to the DMA1
unit 101 and the external DMA channel interface 123 serves as interface to the DMA2 unit 93. The
DTU 92 further comprises a second processor interface 110 allowing a programming link 94 to be
established via the processor bus 100 to the processor P2 (not shown in Figure 6). The DTU 92
further comprises another direct access unit core (DAU core) 112, and two external DMA channel
interfaces 111 and 113. The external DMA channel interface 111 serves as interface to the DMA2
unit 93 and the external DMA channel interface 113 serves as interface to the DMA1 unit 101.
How the two clock signals clock1 and clock2 are applied is also shown in Figure 6.

The DTU 92 programming is preferably done using two separate register sets, each register set being
assigned to one processor, P1 or P2. This allows conflicts with simultaneous accesses
performed by the two DAU cores 112 and 122 to be avoided. However, a prioritization scheme is
required that allows requests from the processor P1 or requests from the processor P2 to be
prioritised. The following two schemes are proposed:
- No priority is specified and the operation is based on the first come first serve principle, i.e.,
  the processor that comes first has the priority over the other processor. Ongoing transfers are
  always completed and new transfers are put on a waiting queue.
- The processor P1 has priority over the processor P2, or vice versa. An ongoing transfer of
  low priority data can be interrupted by a request submitted by the processor that has the
  higher priority. The interrupt of the transfer happens transparently to the low priority core.
  After the high priority request has finished, the low priority request is resumed. A high
  priority transfer is never interrupted.
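
The two schemes can be contrasted with a short simulation. This is a sketch under assumptions,
not the disclosed arbitration logic: the function names, the request format (arrival tick,
processor, number of words, with at least one word per request), and the one-word-per-tick
timing are all hypothetical. Each function returns, tick by tick, which processor's transfer
occupies the common DTU.

```python
from collections import deque


def first_come_first_served(requests):
    """Scheme 1: an ongoing transfer always completes; later requests wait in a queue."""
    pending = deque(sorted(requests))      # (arrival_tick, processor, words)
    waiting = deque()
    active = None                          # [processor, words_left] or None
    trace, tick = [], 0
    while pending or waiting or active:
        while pending and pending[0][0] <= tick:
            _, processor, words = pending.popleft()
            waiting.append([processor, words])
        if active is None and waiting:
            active = waiting.popleft()
        if active is None:
            trace.append(None)             # DTU idle in this tick
        else:
            trace.append(active[0])
            active[1] -= 1
            if active[1] <= 0:
                active = None              # transfer finished
        tick += 1
    return trace


def fixed_priority(requests, high="P1"):
    """Scheme 2: a request from `high` pre-empts a low priority transfer, which
    resumes afterwards; a high priority transfer is never interrupted."""
    pending = deque(sorted(requests))
    waiting = {True: deque(), False: deque()}   # keyed by "is high priority"
    trace, tick = [], 0
    while pending or waiting[True] or waiting[False]:
        while pending and pending[0][0] <= tick:
            _, processor, words = pending.popleft()
            waiting[processor == high].append([processor, words])
        queue = waiting[True] or waiting[False]  # serve high priority first
        if not queue:
            trace.append(None)
        else:
            head = queue[0]
            trace.append(head[0])
            head[1] -= 1
            if head[1] <= 0:
                queue.popleft()
        tick += 1
    return trace
```

With a four-word request from P2 at tick 0 and a two-word request from P1 at tick 1, the first
function serves P2 to completion before P1, whereas the second one lets P1 interrupt P2 after one
word and resumes P2 once P1 has finished.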



According to the present invention, the DTU units make use of external DMA channels to transfer
data to/from the shareable unit that is connectable to the processor bus of the other processor. Such
an external DMA channel, contrary to the internal DMA channels which are programmed by the
respective processor, is set up by external agents in order to get access to the resources of the other
processor. The external agents in this patent application are the commands programmed by a
remote processor to have access to a resource on the local processor; the internal DMA channels are
programmed by the local processor itself.

The present invention can also be employed in systems with more than two processors. A third
processor might be connected via its own processor bus, a third DMA3 unit and a third DTU3 to
the DMA2 unit of the second processor, for example. This would allow the third processor to
establish a bi-directional channel to resources that are associated with the second processor.



In yet another embodiment of the invention, two or more processors and a communication channel
for inter-processor communication in accordance with the present invention are integrated into a
custom application specific integrated circuit (ASIC).



It is an advantage of the architecture presented and claimed herein that it supports multiple
heterogeneous processors. The inventive scheme can be expanded to suit general multi-core
communication needs. Due to the present invention, the number of bus masters for each processor
can be reduced, as potentially available DMA units can be used for this purpose. The concept and
design reuse is another advantage. Different other advantages have been mentioned in connection
with the various embodiments of the present invention.

The proposed architecture is symmetric and applicable to most microprocessor architectures. It can 
be expanded to multi-core architectures, i.e., it is independent of the number of cores. 

The present invention is well suited for use in computing devices, such as PDAs, handheld
computers, palm top computers, and so forth. It is also suited for being used in cellular phones (e.g.,
GSM phones), cordless phones (e.g., DECT phones), and so forth. The architecture proposed
herein can be used in chips or chip sets for the above devices or chips for Bluetooth applications.

It is appreciated that various features of the invention which are, for clarity, described in the context
of separate embodiments may also be provided in combination in a single embodiment. Conversely,
various features of the invention which are, for brevity, described in the context of a single
embodiment may also be provided separately or in any suitable subcombination.

In the drawings and specification there have been set forth preferred embodiments of the invention
and, although specific terms are used, the description thus given uses terminology in a generic and
descriptive sense only and not for purposes of limitation.
1. System comprising

- a first processor bus (10; 70; 90),

- a first processor (P1) being connectable to the first processor bus (10; 70; 90),

- a first direct memory access unit (45; 83; 101) with a first external direct memory
access channel (47; 85; 106), the first direct memory access unit (45; 83; 101) being
connectable to the first processor bus (10; 70; 90),

- a first programmable unit (34; 82; 92) being connectable via the first external direct
memory access channel (47; 85; 106) to the first direct memory access unit (45; 83;
101), said first programmable unit (34; 82; 92) being programmable by the first
processor (P1),

- a first shareable unit (13; 76; 93) being connectable to the first processor bus (10; 70;
90),

- a second processor bus (20; 80; 100),

- a second processor (P2) being connectable to the second processor bus (20; 80; 100),

- a second direct memory access unit (35; 73; 93) with a second external direct memory
access channel (36; 75; 96), the second direct memory access unit (35; 73; 93) being
connectable to the second processor bus (20; 80; 100),

- a second programmable unit (44; 72; 92) being connectable via the second external
direct memory access channel (36; 75; 96) to the second direct memory access unit (35;
73; 93), said second programmable unit (44; 72; 92) being programmable by the
second processor (P2), and

- a second shareable unit (23; 86; 103) being connected to the processor bus (20; 80;
100),

wherein a first bi-directional communication channel is establishable between the first shareable
unit (13; 76; 93) and the second processor (P2), and a second bi-directional communication
channel is establishable between the second shareable unit (23; 86; 103) and the first processor
(P1).
2. The system of claim 1, wherein the first bi-directional communication channel and/or the
second bi-directional communication channel are half-duplex channels or full-duplex channels.

3. The system of claim 1, wherein the processor (P1) and the processor (P2) are similar from an
architectural point of view.

4. The system of claim 1, wherein the processor (P1) and the processor (P2) are implementations
of the same type of processor design.

5. The system of claim 1, wherein the processor (P1) and the processor (P2) are implementations
of different types of processor design.

6. The system of claims 1 - 5, wherein the shareable unit (13; 76; 93; 23; 86; 103) is either of
the following: a memory, a peripheral, an interface, an input device, an output device.

7. The system of claims 1 - 5, wherein one of the two integrated processors (P1, P2) is a
central processing unit (CPU), a microprocessor, a digital signal processor (DSP), a system
controller (SC), a co-processor, or an auxiliary processor.

8. The system of claims 1 - 5, wherein the first programmable unit (34; 82; 92) and/or the
second programmable unit (44; 72; 92) comprises a processor interface (50; 60; 110; 120), a
direct access unit core (52; 62; 112; 122), and two external direct memory access channel
interfaces (51; 53; 61; 63; 111; 113; 121; 123).

9. The system of claim 8, wherein the processor interface (50; 60; 110; 120) has a programming
link (12, 22; 32, 42; 51, 52; 74, 84; 94, 104) either for connecting to a processor bus (10, 20;
70, 80; 90, 100) or for connecting to a processor (P1, P2).

10. The system of any of the preceding claims, wherein the communication channels are
establishable for transferring data and/or control information to and from the shareable unit
(13; 76; 93; 23; 86; 103).
11. A computing device comprising a first processor (P1) and a second processor (P2) being
arranged on a common semiconductor die and being operably connected via bi-directional
communication channels for exchanging information, the computing device further comprising

- a first processor bus (10; 70; 90), the first processor (P1) being connectable to the first
processor bus (10; 70; 90),

- a first direct memory access unit (45; 83; 101) with a first external direct memory
access channel (47; 85; 106), the first direct memory access unit (45; 83; 101) being
connectable to the first processor bus (10; 70; 90),

- a first programmable unit (34; 82; 92) being connectable via the first external direct
memory access channel (47; 85; 106) to the first direct memory access unit (45; 83;
101), said first programmable unit (34; 82; 92) being programmable by the first
processor (P1),

- a first shareable unit (13; 76; 93) being connectable to the first processor bus (10; 70;
90),

- a second processor bus (20; 80; 100), the second processor (P2) being connectable to
the second processor bus (20; 80; 100),

- a second direct memory access unit (35; 73; 93) with a second external direct memory
access channel (36; 75; 96), the second direct memory access unit (35; 73; 93) being
connectable to the second processor bus (20; 80; 100),

- a second programmable unit (44; 72; 92) being connectable via the second external
direct memory access channel (36; 75; 96) to the second direct memory access unit (35;
73; 93), said second programmable unit (44; 72; 92) being programmable by the
second processor (P2), and

- a second shareable unit (23; 86; 103) being connected to the processor bus (20; 80;
100).

12. The computing device of claim 11 being part of a PDA, a handheld computer, a palm top
computer, a cellular phone, or a cordless phone.
ABSTRACT

Improved Inter-Processor Communication System for Communication between Processors

System comprising at least two integrated processors (P1 and P2). These two processors (P1 and P2)
are operably connected via two bi-directional communication channels for exchanging information.
For establishing the bi-directional communication channels, the system comprises a first processor
bus (10) to which the first processor (P1) is connected, a first direct memory access unit (45), a first
programmable unit (34), and a first shareable unit (13). The programmable unit (34) can be
programmed by the first processor (P1). Also comprised is a second processor bus (20), the second
processor (P2) being connectable to the second processor bus (20), a second direct memory access
unit (35), and a second programmable unit (44). Said second programmable unit (44) is
programmable by the second processor (P2).

(Figure 2)